Genome Sequencing Of Endangered Species

Genome sequencing of endangered species is the application of next-generation sequencing (NGS) technologies in the field of conservation biology, with the aim of generating life history, demographic and phylogenetic data of relevance to the management of endangered wildlife.


Background

In the context of conservation biology, genomic technologies such as the production of large-scale sequencing data sets via DNA sequencing can be used to highlight the relevant aspects of the biology of wildlife species for which management actions may be required. This may involve the estimation of recent demographic events, genetic variation, divergence between species and population structure. Genome-wide association studies (GWAS) are useful for examining the role of natural selection at the genome level and for identifying the loci associated with fitness, local adaptation, inbreeding depression or disease susceptibility. Access to these data, together with the interrogation of genome-wide variation at single-nucleotide polymorphism (SNP) markers, can help identify the genetic changes that influence the fitness of wild species, and is also important for evaluating their potential response to changing environments. NGS projects are expected to rapidly increase the number of threatened species for which assembled genomes and detailed information on sequence variation are available, and these data will advance investigations relevant to the conservation of biological diversity.


Methodology


Non-computational methods

The traditional approaches to the preservation of endangered species are captive breeding and private farming. In some cases these methods have led to great results, but some problems remain. For example, when only a few individuals are bred with one another, the gene pool of a subpopulation remains limited or may decrease.


Phylogenetic analysis and gene family estimation

Genetic analyses can remove subjective elements from the determination of the phylogenetic relationships between organisms. Given the great variety of information provided by living organisms, the type of data affects both the method of treatment and the validity of the results: the higher the correlation between data and genotype, the greater the validity is likely to be. Data analysis can be used to compare different sequence databases and find similar sequences, or similar proteins, in different species. The comparison can be done with alignment-based software in order to estimate the divergence between species and evaluate their similarities.
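As a minimal illustration of how divergence can be read from an alignment, the sketch below computes the uncorrected p-distance (the proportion of differing sites) between pairs of sequences that are already aligned. The sequences, the species labels and the function name `p_distance` are hypothetical and used only for illustration; real analyses rely on dedicated alignment and phylogenetics software and usually on corrected distance models.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: not comparable
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical, already-aligned fragments (not real data).
alignment = {
    "species_A": "ATGCCGTA-GCT",
    "species_B": "ATGCTGTAAGCT",
    "species_C": "ATGATGTA-GTT",
}

# Pairwise distances for every pair of species, in sorted name order.
distances = {
    (x, y): p_distance(alignment[x], alignment[y])
    for x, y in combinations(sorted(alignment), 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

A matrix of such pairwise distances is the kind of input that distance-based tree-building methods start from.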


NGS/Advanced sequencing methodologies

Since whole-genome sequencing is generally very data-intensive, reduced-representation genomic approaches are sometimes used for practical applications. For example, restriction site-associated DNA sequencing (RADseq) and double digest RADseq (ddRADseq) have been developed. With these techniques researchers can target different numbers of loci. Combined with statistical and bioinformatic approaches, they allow scientists to draw inferences about large genomes by focusing on a small representative part of them.
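The idea of sampling a small, repeatable part of a genome can be sketched with an in silico double digest. The example below is a simplified illustration, not a protocol: it locates the recognition sites of two restriction enzymes in a toy sequence and keeps only the fragments that are flanked by one site of each enzyme and fall inside a size window. The toy genome, the size window and the function names are assumptions made for this sketch, and cut positions are approximated by the start of each recognition site.

```python
def find_sites(genome: str, motif: str) -> list:
    """Return the start positions of every occurrence of a recognition motif."""
    positions = []
    start = genome.find(motif)
    while start != -1:
        positions.append(start)
        start = genome.find(motif, start + 1)
    return positions


def double_digest_fragments(genome, motif_a, motif_b, min_len, max_len):
    """Fragments between adjacent cut sites of two different enzymes.

    Simplification: each cut is placed at the start of the recognition site.
    Returns (start, end) pairs whose length lies within [min_len, max_len].
    """
    cuts = sorted(
        [(pos, "A") for pos in find_sites(genome, motif_a)]
        + [(pos, "B") for pos in find_sites(genome, motif_b)]
    )
    selected = []
    for (left, enzyme_left), (right, enzyme_right) in zip(cuts, cuts[1:]):
        length = right - left
        if enzyme_left != enzyme_right and min_len <= length <= max_len:
            selected.append((left, right))
    return selected


# Toy sequence built for illustration only (not real genomic data).
# GAATTC is the EcoRI recognition site; CCGG is the MspI recognition site.
genome = (
    "TTTT" + "GAATTC" + "A" * 20 + "CCGG" + "T" * 50
    + "GAATTC" + "A" * 10 + "GAATTC" + "TTT"
)

fragments = double_digest_fragments(genome, "GAATTC", "CCGG", min_len=20, max_len=40)
sampled = sum(end - start for start, end in fragments)
print(fragments, f"{sampled / len(genome):.1%} of the toy genome retained")
```

Changing the enzyme pair or the size window changes how many loci are retained, which is how the number of targeted loci can be tuned.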


Statistical and computational methods

Solving biological problems often involves multiple types of genomic data, or an aggregate of the same type of data across multiple studies, and decoding such a large amount of data manually is tedious and infeasible. Integrated analysis of genomic data using statistical methods has therefore become popular. The rapid advancement of high-throughput technologies allows researchers to answer more complex biological questions, and has enabled the development of statistical methods in integrated genomics to establish more effective therapeutic strategies for human disease.


Genome crucial features

When studying a genome, some crucial aspects should be taken into consideration. Gene prediction is the identification of genetic elements in a genomic sequence. It is based on a combination of approaches: de novo prediction, homology-based prediction, and transcript-based prediction. Tools such as EvidenceModeler are used to merge the different results. Gene structures have also been compared, including mRNA length, exon length, intron length, exon number, and non-coding RNA. Analysis of repeated sequences has been found useful in reconstructing species divergence timelines.
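The gene-structure features listed above can be derived from the exon coordinates of a gene model. The sketch below is a minimal example under stated assumptions: exon coordinates are half-open `(start, end)` intervals on the forward strand, "mRNA length" is taken to mean the summed exon length of the spliced transcript, and the gene model itself is hypothetical.

```python
def gene_structure(exons):
    """Summarise a gene model given as a list of half-open (start, end) exons."""
    ordered = sorted(exons)
    exon_lengths = [end - start for start, end in ordered]
    # An intron spans from the end of one exon to the start of the next.
    intron_lengths = [
        next_start - end
        for (_, end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    return {
        "exon_number": len(ordered),
        "exon_lengths": exon_lengths,
        "intron_lengths": intron_lengths,
        "mrna_length": sum(exon_lengths),  # assumption: spliced transcript length
    }


# Hypothetical three-exon gene model (coordinates are illustrative only).
example_gene = [(100, 250), (400, 520), (900, 1000)]
summary = gene_structure(example_gene)
print(summary)
```

Computing the same summary for orthologous genes in two species gives a simple basis for the kind of structural comparison described above.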


Application and case studies


Genomic approach in gender determination

In order to preserve a species, knowledge of its mating system is crucial: scientists can stabilize wild populations through captive breeding, followed by the release of new individuals into the environment. This task is particularly difficult for species with homomorphic sex chromosomes and a large genome. For example, in amphibians there have been multiple transitions between male and female heterogamety. In some cases, variation in sex chromosomes has even been reported within amphibian populations of the same species.


Japanese giant salamander

The multiple transitions between XY and ZW systems that occur in amphibians make sex chromosome systems labile in salamander populations. By understanding the chromosomal basis of sex in these species, it is possible to reconstruct the phylogenetic history of their families and to use more efficient strategies in their conservation. Using the ddRADseq method, scientists found new sex-related loci in a 56 Gb genome of the family Cryptobranchidae. Their results support the hypothesis of female heterogamety in this species. The candidate loci were identified through bioinformatic analysis of the presence or absence of each locus in individuals of known sex, which had been established previously by ultrasound, laparoscopy and measurement of serum calcium level differences. The search for candidate sex-linked loci was designed to test the hypotheses of both female heterogamety and male heterogamety. Finally, to evaluate their validity, the loci were amplified by PCR directly from samples of known-sex individuals. This final step demonstrated female heterogamety in several divergent populations of the family Cryptobranchidae.
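The presence/absence screen described above can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: a locus is flagged as a candidate only if it is detected in every individual of one sex and in none of the other. Loci found only in females are consistent with W-linkage (female heterogamety, ZW), and loci found only in males are consistent with Y-linkage (male heterogamety, XY). The individual identifiers, locus names and strict all-or-none criterion are assumptions made for this example.

```python
def sex_specific_loci(presence, sexes):
    """Screen loci for strict sex-specific presence.

    presence: dict mapping locus name -> set of individuals where it was detected
    sexes:    dict mapping individual -> "F" or "M" (known from independent methods)
    """
    females = {ind for ind, sex in sexes.items() if sex == "F"}
    males = {ind for ind, sex in sexes.items() if sex == "M"}
    female_specific, male_specific = [], []
    for locus in sorted(presence):
        carriers = presence[locus]
        if carriers == females:
            female_specific.append(locus)  # consistent with W-linkage (ZW)
        elif carriers == males:
            male_specific.append(locus)  # consistent with Y-linkage (XY)
    return {"female_specific": female_specific, "male_specific": male_specific}


# Hypothetical genotyping results (not data from the study).
sexes = {"ind1": "F", "ind2": "F", "ind3": "F", "ind4": "M", "ind5": "M", "ind6": "M"}
presence = {
    "locus_001": {"ind1", "ind2", "ind3"},
    "locus_002": {"ind1", "ind2", "ind3", "ind4", "ind5", "ind6"},
    "locus_003": {"ind1", "ind2", "ind3"},
    "locus_004": {"ind2", "ind4", "ind5"},
}

candidates = sex_specific_loci(presence, sexes)
print(candidates)
```

In this toy data set only female-specific candidates are found, which is the pattern expected under female heterogamety. Candidates from such a screen would still need laboratory confirmation, for example by PCR in known-sex individuals as described above.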


Genomic approach in genetic variability


Dryas monkey and golden snub-nosed monkey

A recent study used whole-genome sequencing data to demonstrate the sister-lineage relationship between the Dryas monkey and the vervet monkey, and their divergence with additional bidirectional gene flow approximately 750,000 to 500,000 years ago. Although fewer than 250 adult individuals remain, the study found high genetic diversity and low levels of inbreeding and genetic load in the Dryas monkey individuals studied. Another study used several techniques, including single-molecule real-time sequencing, paired-end sequencing, optical maps, and high-throughput chromosome conformation capture, to obtain a high-quality chromosome-level assembly from an existing incomplete and fragmented genome assembly of the golden snub-nosed monkey. The modern techniques used in this study yielded a 100-fold improvement in the genome assembly, which contains 22,497 protein-coding genes, the majority of which were functionally annotated. The reconstructed genome showed a close relationship between the species and the rhesus macaque, indicating a divergence approximately 13.4 million years ago.


Genomic approach in preservation


Plants

Plant species identified as PSESP ("plant species with extremely small populations") have been the focus of genomic studies, with the aim of determining the most endangered populations. Genomic DNA can be extracted from fresh leaves and sequenced. Combining different sequencing techniques yields high-quality data that can be used to assemble the genome. RNA extraction is essential for transcriptome assembly, and the extraction process starts from stems, roots, fruits, buds and leaves. The ''de novo'' genome assembly can be performed using software that optimizes assembly and scaffolding. Software can also be used to fill gaps and to reduce the interaction between chromosomes. The combined data can be used for the identification of orthologous genes across species, phylogenetic tree construction, and interspecific genome comparisons.


Limits and future perspectives

The development of indirect sequencing methods mitigated, to some degree, the lack of efficient DNA sequencing technologies. These techniques allowed researchers to increase scientific knowledge in fields such as ecology and evolution. Several genetic markers, more or less well suited to the purpose, were developed, helping researchers to address many issues, including demography and mating systems, population structure and phylogeography, speciation processes and species differences, hybridization and introgression, and phylogenetics at many temporal scales. However, all these approaches had a primary deficiency: they were limited to a fraction of the entire genome, so that genome-wide parameters were inferred from a tiny amount of genetic material.

The invention and rise of DNA sequencing methods greatly increased the amount of available data potentially useful to conservation biology. The ongoing development of cheaper, high-throughput sequencing allowed the production of a wide array of information in several disciplines, providing conservation biologists with a very powerful databank from which it is possible to extract useful information about, for example, population structure, genetic connections, and potential risks due to demographic changes and inbreeding, through population-genomic approaches that rely on the detection of SNPs, indels or copy-number variants (CNVs). On the one hand, data derived from high-throughput sequencing of whole genomes represent a potentially massive advance in the field of species conservation, opening the door to future challenges and opportunities. On the other hand, these data confront researchers with two main issues: first, how to process all this information; second, how to translate the available information into conservation strategies and practice or, in other words, how to bridge the gap between genomic research and conservation application.

Unfortunately, there are many analytical and practical problems to consider when using approaches that involve genome-wide sequencing. Availability of samples is a major limiting factor: sampling procedures may disturb an already fragile population or may have a large impact on individual animals, which limits sample collection. For these reasons several alternative strategies were developed. Constant monitoring, for example with radio collars, makes it possible to understand the animals' behaviour and to develop strategies for obtaining genetic samples and managing the endangered populations. The samples taken from these species are then used to produce primary cell cultures from biopsies. This kind of material makes it possible to grow cells in vitro and to extract and study genetic material without constantly sampling the endangered populations.

Despite faster and easier data production and the continuous improvement of sequencing technologies, data analysis and processing techniques still lag markedly behind. Genome-wide analyses and studies of large genomes require advances in bioinformatics and computational biology. At the same time, improvements in statistical programs and in population genetics are required to devise better conservation strategies. This last aspect works in parallel with prediction strategies, which should take into consideration all the features that determine the fitness of a species.


See also

* Endangered species

